17 research outputs found

    Enhancement of emotion recognition using feature fusion and the neighborhood components analysis

    Abstract Feature fusion is a common approach to improving the accuracy of a recognition system. Several attempts have been made using this approach on the Mahnob-HCI database for affective recognition, with the best reported accuracies being 76% for valence and 68% for arousal. This study aimed to improve the baselines for both valence and arousal through fusion of HRV-based features, computed with standard Heart Rate Variability analysis and either standardized to zero mean/unit standard deviation or normalized to [-1,1], and cvxEDA-based features, calculated with a convex-optimization approach. After feature selection with the sequential forward floating search (SFFS), the selected features were enhanced by Neighborhood Components Analysis (NCA) and fed to a kNN classifier for a 3-class classification problem, validated with leave-one-out (LOO), leave-one-subject-out (LOSO), and 10-fold cross validation. The standardized HRV-based features were never selected by SFFS, leaving a fusion of the normalized HRV-based and cvxEDA-based features only. The results were compared to previous single- and multi-modality studies. NCA enhanced the features such that the valence performance set new baselines: 82.4% (LOO), 79.6% (10-fold cross validation), and 81.9% (LOSO), surpassing the best previous results from both single- and multi-modality studies. For arousal, the performances were 78.3%, 78.7%, and 77.7% for LOO, LOSO, and 10-fold cross validation respectively; these outperformed the best feature-fusion result but did not surpass the single-modality study using cvxEDA-based features. Future work includes other feature extraction methods and more sophisticated classifiers than simple kNN.
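    The NCA-then-kNN stage described above can be sketched with scikit-learn. The fused HRV/cvxEDA feature matrix is replaced here by synthetic data, so the sample counts, dimensions, and parameters are illustrative assumptions, not the study's actual setup.

```python
# Sketch: learn an NCA transform, then classify with kNN under 10-fold CV.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the fused feature matrix (3 emotion classes).
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)

# NCA learns a linear transform that pulls same-class neighbors together,
# which directly benefits the distance-based kNN classifier that follows.
pipe = Pipeline([
    ("nca", NeighborhoodComponentsAnalysis(random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
scores = cross_val_score(pipe, X, y, cv=10)  # 10-fold cross validation
print(round(scores.mean(), 3))
```

    Because NCA is optimized for nearest-neighbor accuracy, pairing it with kNN (rather than a generic metric) is the natural combination the abstract exploits.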

    Comparing features from ECG pattern and HRV analysis for emotion recognition system

    Abstract We propose new features for emotion recognition from short ECG signals. The features represent the statistical distribution of dominant frequencies, calculated using spectrogram analysis of the intrinsic mode functions after applying bivariate empirical mode decomposition to the ECG. A kNN classifier was used to classify emotions in valence and arousal for a 3-class problem (low-medium-high). Using ECG from the Mahnob-HCI database, the average accuracies for valence and arousal were 55.8% and 59.7% respectively with 10-fold cross validation. The accuracies using features from standard Heart Rate Variability analysis were 42.6% and 47.7% for valence and arousal respectively on the same 3-class problem. The proposed features were also tested with subject-independent validation, achieving an accuracy of 59.2% for valence and 58.7% for arousal. They likewise outperformed features based on the statistical distribution of instantaneous frequency, calculated using the Hilbert transform of the intrinsic mode functions after applying standard and bivariate empirical mode decomposition to the ECG. We conclude that the proposed features offer a promising approach to emotion recognition from short ECG signals, and could also be used in applications that must quickly detect changes in emotional state.
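    The "dominant frequency" idea can be illustrated in a few lines: take the peak frequency of each short-time FFT window and summarize its distribution. The BEMD step is omitted here, and the sine-plus-noise test signal, window length, and chosen statistics are assumptions for illustration.

```python
# Sketch: per-window spectral peak frequency, summarized by statistics.
import numpy as np

def dominant_freq_features(x, fs, win=256, hop=128):
    """Peak frequency of each Hann-windowed FFT frame -> (mean, std, min, max)."""
    peaks = []
    for start in range(0, len(x) - win + 1, hop):
        seg = x[start:start + win] * np.hanning(win)
        spec = np.abs(np.fft.rfft(seg))
        freqs = np.fft.rfftfreq(win, d=1.0 / fs)
        peaks.append(freqs[np.argmax(spec)])
    peaks = np.array(peaks)
    return np.array([peaks.mean(), peaks.std(), peaks.min(), peaks.max()])

fs = 256.0
t = np.arange(0, 10, 1 / fs)
# Synthetic 8 Hz oscillation plus mild noise stands in for an IMF.
x = np.sin(2 * np.pi * 8.0 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
feats = dominant_freq_features(x, fs)
print(np.round(feats, 2))  # mean peak frequency should sit near 8 Hz
```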

    Enhancing emotion recognition from ECG signals using supervised dimensionality reduction

    Abstract Dimensionality reduction (DR) is an important step in the classification and pattern recognition process. Using features of lower dimensionality helps machine learning algorithms work more efficiently, and it can also improve the performance of the system. This paper explores three supervised dimensionality reduction methods, LDA (Linear Discriminant Analysis), NCA (Neighbourhood Components Analysis), and MCML (Maximally Collapsing Metric Learning), for emotion recognition based on ECG signals from the Mahnob-HCI database, posed as a 3-class problem for both valence and arousal. Features for the kNN (k-nearest neighbour) classifier are based on the statistical distribution of dominant frequencies after applying a bivariate empirical mode decomposition. The results were validated using 10-fold cross validation and LOSO (leave-one-subject-out) validation. Among LDA, NCA, and MCML, NCA outperformed the other methods. After transforming the features with the projection matrices from NCA, accuracy under 10-fold cross validation improved from 55.8% to 64.1% for valence and from 59.7% to 66.1% for arousal. Under LOSO validation there was no significant improvement for valence, while the improvement for arousal was significant, from 58.7% to 69.6%.
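    The project-then-classify workflow can be sketched with LDA, one of the three methods compared. The data are synthetic stand-ins; note that for brevity this fits the projection on all samples at once, whereas a faithful evaluation would fit it inside each cross-validation fold.

```python
# Sketch: supervised projection (LDA) before a kNN classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_classes=3, random_state=1)

# LDA projects the 20-D features onto at most (n_classes - 1) = 2 dimensions.
Z = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

knn = KNeighborsClassifier(n_neighbors=5)
raw = cross_val_score(knn, X, y, cv=10).mean()      # kNN on raw features
reduced = cross_val_score(knn, Z, y, cv=10).mean()  # kNN on projected features
print(round(raw, 3), round(reduced, 3))
```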

    Bivariate empirical mode decomposition for ECG-based biometric identification with emotional data

    Abstract Emotions modulate ECG signals and may therefore affect ECG-based biometric identification in real-life applications. This motivates the search for feature extraction methods on which the subject's emotional state has minimal impact. This paper evaluates feature extraction based on bivariate empirical mode decomposition (BEMD) for biometric identification when emotion is taken into account. Using ECG signals from the Mahnob-HCI database for affect recognition, the features were statistical distributions of dominant frequency after applying BEMD to the ECG. A kNN classifier with 10-fold cross validation achieved 99.5% accuracy, with high consistency, in identifying 26 subjects when their emotional states were ignored; when the emotional states were considered, the proposed method still delivered around 99.4% accuracy. We conclude that the proposed method offers emotion-independent features for ECG-based biometric identification. Further evaluation is needed with other classifiers and with variation in the ECG signals, e.g. normal ECG vs. ECG with arrhythmias, ECG across age groups, and ECG from other affective databases.
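    The identification setup, nearest-neighbor matching of a held-out feature vector against labeled subjects, can be shown in miniature. Here each "subject" is a synthetic cluster of feature vectors; the cluster spread and dimensionality are assumptions, not values from the study.

```python
# Sketch: 1-NN subject identification with leave-one-out evaluation.
import numpy as np

rng = np.random.default_rng(0)
n_subjects, per_subject, dim = 26, 20, 8
centers = rng.normal(scale=5.0, size=(n_subjects, dim))
# Each subject contributes several noisy feature vectors around a center.
X = np.vstack([c + rng.normal(scale=0.5, size=(per_subject, dim)) for c in centers])
y = np.repeat(np.arange(n_subjects), per_subject)

def predict_1nn(train_X, train_y, queries):
    """Assign each query the label of its nearest training vector."""
    d = np.linalg.norm(queries[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[np.argmin(d, axis=1)]

# Leave-one-out over all samples (cheap at this scale).
correct = 0
for i in range(len(X)):
    mask = np.arange(len(X)) != i
    correct += predict_1nn(X[mask], y[mask], X[i:i + 1])[0] == y[i]
acc = correct / len(X)
print(round(acc, 3))
```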

    MCU-based isolated appealing words detecting method with AI techniques

    Abstract Campus bullying has attracted increasing attention in recent years. Analysis of typical campus bullying events shows that victims often use the word “help” and other appealing or begging words; speech recognition can therefore detect a bullying event in time for measures to be taken against further harm. The main purpose of this study is to help guardians discover campus bullying as it happens, by monitoring in real time for bullying-related keywords, so that countermeasures can be taken immediately and the harm minimized. Based on a Sunplus MCU and speech recognition technology, using MFCC acoustic features and an efficient DTW classifier, we implemented detection of common campus-bullying vocabulary for a specific speaker's voice. After repeated experiments, and by combining the voice signal processing functions of the Sunplus MCU, the recognition procedure for specific isolated words was completed. On this basis, we obtained an average accuracy of 99% on appealing words for the dedicated speaker, with a very low misrecognition rate for other words and other speakers.
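    The DTW classifier named above can be written in a few lines. For clarity this compares 1-D sequences rather than MFCC vectors; extending the per-step cost to a vector distance is straightforward. The example sequences are invented.

```python
# Sketch: classic O(n*m) dynamic time warping distance between sequences.
def dtw_distance(a, b):
    """Minimum cumulative |a_i - b_j| cost over all monotone alignments."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

template = [0, 1, 2, 3, 2, 1, 0]
stretched = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]  # same word, spoken slower
other = [3, 3, 3, 0, 0, 0, 3]
# Distance 0 to the time-stretched copy, positive to the different word.
print(dtw_distance(template, stretched), dtw_distance(template, other))
```

    This warping invariance is exactly why DTW suits isolated-word matching: the same keyword spoken at different speeds still aligns to its template at low cost.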

    A video-based DT–SVM school violence detecting algorithm

    Abstract School bullying is a serious problem among teenagers. School violence is one type of school bullying and is considered the most harmful. As AI (Artificial Intelligence) techniques develop, new methods to detect school violence become possible. This paper proposes a video-based school violence detecting algorithm. The algorithm first detects foreground moving targets via the KNN (K-Nearest Neighbor) method and then preprocesses the detected targets with morphological operations. Next, this paper proposes a circumscribed-rectangular-frame integrating method to optimize the bounding rectangles of moving targets. Rectangular-frame features and optical-flow features were extracted to describe the differences between school violence and daily-life activities. We used the Relief-F and Wrapper algorithms to reduce the feature dimension. An SVM (Support Vector Machine) was applied as the classifier, and 5-fold cross validation was performed: the accuracy was 89.6%, and the precision was 94.4%. To further improve recognition performance, we developed a DT–SVM (Decision Tree–SVM) two-layer classifier. Using boxplots, we identified features for the DT layer that distinguish typical physical violence from daily-life activities and vice versa; for the remaining activities, the SVM layer performed the classification. With this DT–SVM classifier, the accuracy reached 97.6% and the precision reached 97.2%, a significant improvement.
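    The two-layer DT–SVM idea, a simple threshold rule that settles obvious cases and an SVM that handles the rest, can be sketched as follows. The threshold, feature layout, and synthetic data are assumptions for illustration, not the paper's actual features or values.

```python
# Sketch: rule layer for obvious cases, SVM layer for everything else.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
y = np.array([0, 1] * (n // 2))                 # 0 = daily life, 1 = violence
X = rng.normal(size=(n, 3))
obvious = (y == 1) & (np.arange(n) % 4 == 1)    # half of the violence samples
X[obvious, 0] += 6.0                            # rule-separable on feature 0
hard = ~obvious
X[hard, 1] += y[hard] * 4.0                     # SVM-separable on feature 1

svm = SVC().fit(X[hard, 1:], y[hard])           # train SVM on the hard cases

def dt_svm_predict(x):
    if x[0] >= 3.0:                             # layer 1: obvious violence
        return 1
    return int(svm.predict([x[1:]])[0])         # layer 2: SVM decides the rest

preds = np.array([dt_svm_predict(x) for x in X])
acc = (preds == y).mean()
print(round(acc, 3))
```

    The cascade mirrors the paper's design choice: cheap, interpretable rules absorb the easy decisions so the SVM only needs to model the genuinely ambiguous region.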

    A multi-sensor school violence detecting method based on improved relief-F and D-S algorithms

    Abstract School bullying is a common social problem, and school violence is considered the most harmful form of it. Fortunately, with the development of movement sensors and pattern recognition techniques, it is possible to detect school violence with artificial intelligence. This paper proposes a school violence detecting method based on improved Relief-F and Dempster-Shafer (D-S) algorithms. Two movement sensors are fixed on the subject's waist and leg, respectively, to gather acceleration and gyro data. Altogether nine kinds of activities are gathered, including three kinds of school violence and six kinds of daily-life activities. After wavelet filtering, 39 time-domain features and 12 frequency-domain features are extracted. To reduce computational cost, this paper proposes an improved Relief-F algorithm which selects features according to classification contribution and correlation. Boxplots of the selected features show that the frequency-domain energy of the y-axis of acceleration can distinguish jumping from the other activities. The authors therefore build a two-layer classifier: the first layer is a decision tree which separates jumping from the other activities, and the second layer is a Radial Basis Function (RBF) neural network which classifies the remaining eight kinds of activities. Since the two movement sensors work independently, this paper proposes an improved D-S algorithm for decision-layer fusion. The improved D-S algorithm designs a new probability distribution function on the evidence model and builds a new fusion rule, which solves the problem of fusion conflict. According to the simulation results, the proposed method increased the recognition accuracy compared with the authors' previous work: 89.6% of school violence and 95.1% of daily-life activities were correctly recognized. The accuracy reached 93.6% and the precision reached 87.8%, which were 29.9% and 2.7% higher than the authors' previous work, respectively.
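    The decision-layer fusion can be illustrated with the classic Dempster combination rule (the paper modifies this rule; only the standard form is shown here). Restricting the mass functions to singleton hypotheses, an assumption made for brevity, keeps it to a few lines; the sensor mass values are invented.

```python
# Sketch: classic Dempster-Shafer combination of two sensors' belief masses.
def combine(m1, m2):
    """Dempster's rule for mass functions over singleton hypotheses only."""
    hypotheses = m1.keys()
    # Conflict K: total mass the two sources assign to incompatible pairs.
    K = sum(m1[a] * m2[b] for a in hypotheses for b in hypotheses if a != b)
    # Agreeing mass is renormalized by (1 - K).
    return {h: (m1[h] * m2[h]) / (1.0 - K) for h in hypotheses}

waist = {"violence": 0.7, "daily": 0.3}  # hypothetical waist-sensor masses
leg = {"violence": 0.6, "daily": 0.4}    # hypothetical leg-sensor masses
fused = combine(waist, leg)
print({h: round(v, 3) for h, v in fused.items()})
```

    When the sources agree, fusion sharpens the shared hypothesis; when conflict K approaches 1, the renormalization blows up, which is exactly the fusion-conflict problem the paper's improved rule targets.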

    A school violence detection algorithm based on a single MEMS sensor

    Abstract School violence has become more and more frequent in today's school life and has caused great harm to social and educational development in many countries. This paper used a MEMS sensor fixed on the waist to collect data and performed feature extraction on the sensor's acceleration and gyro data. Altogether nine kinds of activities were recorded, comprising six daily-life activities and three violent activities. A filter-based Relief-F feature selection algorithm was used, and a Radial Basis Function (RBF) neural network classifier was applied to the selected features. The results showed that the algorithm could distinguish physical violence movements from daily-life movements with an accuracy of 90%.
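    The Relief family of feature weights can be sketched compactly. This is the basic binary-class Relief update (a simplification of Relief-F, which averages over k neighbors and handles multiple classes); features whose values differ more for nearest misses than for nearest hits score higher. The synthetic data are an assumption.

```python
# Sketch: binary-class Relief feature weighting on synthetic data.
import numpy as np

def relief_weights(X, y, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    X = (X - X.min(0)) / (X.max(0) - X.min(0))  # scale feature diffs to [0, 1]
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        i = rng.integers(len(X))
        d = np.abs(X - X[i]).sum(axis=1)        # L1 distance to the picked sample
        d[i] = np.inf                           # exclude the sample itself
        same, diff = y == y[i], y != y[i]
        hit = np.where(same & (d == d[same].min()))[0][0]   # nearest same-class
        miss = np.where(diff & (d == d[diff].min()))[0][0]  # nearest other-class
        # Reward features that separate classes, penalize ones that vary within.
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 4))
X[:, 0] += 3 * y              # feature 0 is informative, features 1-3 are noise
w = relief_weights(X, y)
print(np.round(w, 3))         # feature 0 should receive the largest weight
```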

    Speech interactive emotion recognition system based on random forest

    Abstract In daily life, speech is the main medium of human communication, and interpersonal communication is emotional; people hope that a computer can respond based on the emotions contained in the voice. In this paper, we build a speech emotion recognition system as a WeChat program, based on a random forest classifier. First, the system preprocesses the collected speech signals to reduce noise. Second, 16 acoustic features are extracted from the preprocessed signals, and the emotional features of speech are obtained by applying 12 statistical functions to these acoustic features. Emotion classification on the Berlin Speech Emotion Database was evaluated with two classifiers, a random forest and a support vector machine (SVM): the SVM reached 83% accuracy and the random forest 89%. Finally, the random forest classifier is used to build the speech emotion recognition system.
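    The functionals step, collapsing a variable-length acoustic contour into a fixed-size vector of statistics, can be sketched as follows. The abstract counts 12 statistical functions without listing them, so an assumed subset of six is shown, applied to a synthetic per-frame "pitch" contour rather than real acoustic features.

```python
# Sketch: statistical functionals over a contour, then a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def functionals(contour):
    """Collapse a variable-length contour into fixed-size statistics."""
    c = np.asarray(contour, dtype=float)
    return np.array([c.mean(), c.std(), c.min(), c.max(),
                     np.median(c), c.max() - c.min()])

rng = np.random.default_rng(0)
# Two synthetic "emotions": class 1 contours are higher and more variable.
contours, labels = [], []
for k in range(200):
    lab = k % 2
    base = 120 + 60 * lab
    contours.append(base + rng.normal(scale=5 + 10 * lab, size=100))
    labels.append(lab)

F = np.array([functionals(c) for c in contours])  # one fixed-size row per clip
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(F, labels)
print(round(clf.score(F, labels), 3))
```

    Functionals are what make variable-length utterances comparable: every clip, long or short, maps to the same fixed feature vector the classifier expects.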